Abstract:
The creation of an image from another image or from different types of data, including text, scene graphs, and object layouts, is one of the most challenging tasks in computer vision. In addition, capturing images of an object or product from different views can be exhausting and expensive to do manually. With deep learning and artificial intelligence techniques, generating new images from different types of data has now become possible, and significant effort has recently been devoted to developing image generation strategies, with great achievements. To that end, we present in this paper, to the best of the authors' knowledge, the first comprehensive overview of existing image generation methods. Each image generation technique is described based on the nature of the adopted algorithms, the type of data used, and the main objective, and each image generation category is discussed by presenting the proposed approaches. In addition, existing image generation datasets are presented. The evaluation metrics suitable for each image generation category are discussed, and a comparison of the performance of existing solutions is provided to characterize the state of the art and identify its limitations and strengths. Lastly, the current challenges facing this subject are presented.

Abstract:
Since existing locally controllable text-to-image generation methods cannot achieve satisfactory detail, a novel locally controllable text-to-image generation network based on visual-linguistic relation alignment is proposed. The goal of the method is to perform image manipulation and generation semantically through text guidance. The proposed method exploits the relationship between text and image to achieve local control of text-to-image generation. Visual-linguistic matching learns similarity weights between image and text from semantic features to establish fine-grained correspondences between local image regions and words. An instance-level optimization function is introduced into the generation process to accurately control the weights with low similarity and combine them with text features to generate new visual attributes. In addition, a local control loss is proposed to preserve the details of the text and the local regions of the image. Extensive experiments demonstrate that the proposed method achieves superior performance and enables more accurate control over the original image.
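
A hedged sketch of how such fine-grained word-region similarity weights are commonly computed (in the spirit of DAMSM-style attention); the tensor shapes, names, and softmax temperature below are illustrative assumptions, not this paper's actual implementation.

```python
# Illustrative sketch of word-region similarity weighting (DAMSM-style),
# assuming region features from an image encoder and word features from a
# text encoder; shapes and names are assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def word_region_similarity(words, regions, gamma=5.0):
    """words: (B, T, D) word embeddings; regions: (B, N, D) region features.
    Returns attention weights (B, T, N) aligning each word to image regions."""
    words = F.normalize(words, dim=-1)
    regions = F.normalize(regions, dim=-1)
    sim = torch.bmm(words, regions.transpose(1, 2))   # (B, T, N) cosine similarities
    attn = F.softmax(gamma * sim, dim=-1)             # sharpen and normalize over regions
    return attn

# Each word's attended visual context: weighted sum of region features.
B, T, N, D = 2, 12, 64, 256
w, r = torch.randn(B, T, D), torch.randn(B, N, D)
context = torch.bmm(word_region_similarity(w, r), r)  # (B, T, D)
```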

Abstract:
We review the existing literature on generating text from visual data under the cross-modal generation umbrella, which allows us to compare and contrast various approaches that take visual data as input and produce text outputs, without limiting the analysis to narrowly defined areas such as image captioning. We provide a breakdown of image-to-text generation methods into generative and non-generative image captioning and visual dialogue, with further distinctions for relevant areas. We provide template methods and discuss the existing research in light of these templates, highlighting both the salient commonalities between different approaches and significant departures from the templates. Where it is of interest, we also compare templates across distinct areas. To achieve a comprehensive review, we focus on research papers published at 8 leading machine learning conferences in the years 2016–2021, as well as a number of papers that do not conform to our search criteria but nonetheless come from leading venues. This is the first review we know of to provide a systematic description of the current state of image-to-text generation and to tie distinct research areas together by looking at them through the lens of cross-modal generation.

Abstract:
Synthesizing photographic images from given text descriptions is a challenging problem. Although current methods first synthesize an initial blurred image and then refine it into a high-quality one, most existing methods struggle to refine the initial image into an image that corresponds to the text description. In this paper, Multi-resolution Parallel Generative Adversarial Networks for Text-to-Image Synthesis (MRP-GAN) is proposed to generate photographic images. MRP-GAN introduces a multi-resolution parallel structure to refine the initial images when they are not synthesized well; the low-resolution semantics are maintained throughout the whole process by this structure. A Response Gate is designed to fully exploit the capability of the multi-resolution parallel structure by aggregating the outputs of the parallel subnetworks. We also utilize an attention mechanism, named Residual Attention Network, to fine-tune the fine-grained details of the generated images. We evaluate our MRP-GAN model on the CUB and MS-COCO datasets. Extensive experiments demonstrate the state-of-the-art performance of MRP-GAN. Besides, we apply the multi-resolution parallel structure in an existing method to verify its transferability. (c) 2021 Published by Elsevier B.V.
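
One plausible minimal form of such a gated aggregation of parallel resolution branches is sketched below; the sigmoid-gated blend and layer names are assumptions for illustration, not MRP-GAN's published Response Gate.

```python
# Illustrative gated aggregation of two parallel resolution branches,
# assuming feature maps with the same channel count; the sigmoid-gated
# convex combination is an assumption, not MRP-GAN's exact Response Gate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResponseGateSketch(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # 1x1 conv predicts a per-pixel gate from the concatenated branches.
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, low_res_feat, high_res_feat):
        # Upsample the low-resolution branch to match the high-resolution one.
        low_up = F.interpolate(low_res_feat, size=high_res_feat.shape[-2:],
                               mode='bilinear', align_corners=False)
        g = torch.sigmoid(self.gate(torch.cat([low_up, high_res_feat], dim=1)))
        return g * high_res_feat + (1 - g) * low_up  # gated blend of branches

gate = ResponseGateSketch(channels=64)
out = gate(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 64, 64))  # (1, 64, 64, 64)
```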

Abstract:
With the advent of generative adversarial networks, synthesizing images from text descriptions has recently become an active research area. It is a flexible and intuitive way to perform conditional image generation, with significant progress in recent years regarding visual realism, diversity, and semantic alignment. However, the field still faces several challenges that require further research efforts, such as enabling the generation of high-resolution images with multiple objects and developing suitable and reliable evaluation metrics that correlate with human judgement. In this review, we contextualize the state of the art of adversarial text-to-image synthesis models and their development since their inception five years ago, and propose a taxonomy based on the level of supervision. We critically examine current strategies for evaluating text-to-image synthesis models, highlight shortcomings, and identify new areas of research, ranging from the development of better datasets and evaluation metrics to possible improvements in architectural design and model training. This review complements previous surveys on generative adversarial networks with a focus on text-to-image synthesis, which we believe will help researchers to further advance the field.
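
Since the review centers on evaluating text-to-image models, a minimal sketch of the widely used Fréchet Inception Distance (FID) follows, assuming Inception activations have already been extracted for the real and generated images; the feature-extraction step itself is omitted.

```python
# Minimal FID sketch: Fréchet distance between two Gaussians fitted to
# Inception activations of real and generated images. Feature extraction
# (an InceptionV3 forward pass) is assumed to have been done already.
import numpy as np
from scipy.linalg import sqrtm

def fid(real_feats, fake_feats):
    """real_feats, fake_feats: (N, 2048) arrays of Inception activations."""
    mu_r, mu_f = real_feats.mean(0), fake_feats.mean(0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):   # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_f
    return diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean)

score = fid(np.random.randn(500, 2048), np.random.randn(500, 2048))
```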

Abstract:
Generative adversarial networks (GANs) have demonstrated remarkable potential in the realm of text-to-image synthesis. Nevertheless, conventional GANs employing conditional latent space interpolation and manifold interpolation (GAN-INT-CLS) encounter challenges in generating images that accurately reflect the given text descriptions. To overcome these limitations, we introduce TextControlGAN, a controllable GAN-based model specifically designed for text-to-image synthesis tasks. In contrast to traditional GANs, TextControlGAN incorporates a neural network structure, known as a regressor, to effectively learn features from conditional texts. To further enhance the learning performance of the regressor, data augmentation techniques are employed. As a result, the generator within TextControlGAN can learn conditional texts more effectively, leading to the production of images that more closely adhere to the textual conditions. Furthermore, by concentrating the discriminator's training efforts exclusively on GAN training, the overall quality of the generated images is significantly improved. Evaluations conducted on the Caltech-UCSD Birds-200 (CUB) dataset demonstrate that TextControlGAN surpasses the performance of the cGAN-based GAN-INT-CLS model, achieving a 17.6% improvement in Inception Score (IS) and a 36.6% reduction in Fréchet Inception Distance (FID). In supplementary experiments utilizing 128 × 128 resolution images, TextControlGAN exhibits a remarkable ability to manipulate minor features of the generated bird images according to the given text descriptions. These findings highlight the potential of TextControlGAN as a powerful tool for generating high-quality, text-conditioned images, paving the way for future advancements in the field of text-to-image synthesis.
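
A hedged sketch of the general regressor idea: a discriminator head that predicts the text-condition embedding back from an image, giving the generator an extra conditioning loss. The layer sizes, the L2 regression loss, and its weighting are illustrative assumptions, not TextControlGAN's exact design.

```python
# Illustrative auxiliary-regressor setup: alongside the real/fake logit, a
# regressor head predicts the text embedding back from the image, and the
# generator is penalized when its output does not match the condition.
import torch
import torch.nn as nn

class DiscriminatorWithRegressor(nn.Module):
    def __init__(self, feat_dim=512, text_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for a conv feature extractor
            nn.Flatten(), nn.Linear(3 * 64 * 64, feat_dim), nn.LeakyReLU(0.2))
        self.adv_head = nn.Linear(feat_dim, 1)          # real/fake logit
        self.regressor = nn.Linear(feat_dim, text_dim)  # predicts text embedding

    def forward(self, img):
        h = self.backbone(img)
        return self.adv_head(h), self.regressor(h)

D = DiscriminatorWithRegressor()
fake_img, text_emb = torch.randn(4, 3, 64, 64), torch.randn(4, 128)
logit, pred_emb = D(fake_img)
# Generator-side losses: fool the discriminator and match the condition.
g_adv = nn.functional.binary_cross_entropy_with_logits(logit, torch.ones_like(logit))
g_reg = nn.functional.mse_loss(pred_emb, text_emb)
g_loss = g_adv + 1.0 * g_reg  # loss weighting is a placeholder assumption
```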

Abstract:
Altering the content of an image with photo editing tools is a tedious task for an inexperienced user, especially when modifying the visual attributes of a specific object in an image without affecting other constituents such as the background. To simplify the process of image manipulation and to provide more control to users, it is better to utilize a simpler interface such as natural language, which also enables users to semantically modify parts of an image according to a given text. Therefore, in this paper, we address the challenge of manipulating images using natural language descriptions. We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images. TEA-cGAN's contribution is two-fold. The first contribution aims to attend to the locations that need to be modified during generation. It introduces two types of architectures that provide fine-grained attention in both the generator and the discriminator of a Generative Adversarial Network (GAN). To be specific, the first, a single-scale architecture used in the generator, focuses on modifying only the text-relevant regions of an image and leaves other regions untouched, while the second, a multi-scale architecture, extends this idea by taking different scales of image features into account. The second contribution is the generation of higher-resolution images (e.g., 256 x 256), as they provide better quality and stability. Quantitative and qualitative experiments conducted on the CUB and Oxford-102 datasets confirm that both TEA-cGAN architectures outperform existing methods when generating 128 x 128 images as well as higher-resolution 256 x 256 images. (c) 2020 Elsevier Ltd. All rights reserved.
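
The core idea of editing only text-relevant regions can be sketched as a text-conditioned spatial mask that blends edited and original pixels; the mask computation and blending rule below are illustrative assumptions, not TEA-cGAN's actual layers.

```python
# Illustrative text-conditioned spatial mask: score image features against
# the sentence embedding, then blend edited and original pixels so only
# text-relevant regions change. Names and shapes are illustrative assumptions.
import torch
import torch.nn as nn

class TextRegionMask(nn.Module):
    def __init__(self, img_ch=64, text_dim=128):
        super().__init__()
        self.proj = nn.Conv2d(img_ch, text_dim, kernel_size=1)

    def forward(self, feat, text_emb):
        """feat: (B, C, H, W) image features; text_emb: (B, text_dim)."""
        q = self.proj(feat)                               # (B, text_dim, H, W)
        scores = (q * text_emb[:, :, None, None]).sum(1, keepdim=True)
        return torch.sigmoid(scores)                      # (B, 1, H, W) soft mask

mask_net = TextRegionMask()
feat, emb = torch.randn(2, 64, 32, 32), torch.randn(2, 128)
m = mask_net(feat, emb)
original, edited = torch.randn(2, 3, 32, 32), torch.randn(2, 3, 32, 32)
m_img = nn.functional.interpolate(m, size=(32, 32))      # match image resolution
output = m_img * edited + (1 - m_img) * original         # edit only masked regions
```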

Abstract:
Composing Text and Image to Image Retrieval (CTI-IR) is an emerging task in computer vision that retrieves images relevant to a query image together with text describing desired modifications to that image. Most conventional cross-modal retrieval approaches take data of one modality as the query to retrieve relevant data of another modality. Different from existing methods, in this article we propose an end-to-end trainable network for simultaneous image generation and CTI-IR. The proposed model is based on the Generative Adversarial Network (GAN) and enjoys several merits. First, it can learn a generative and discriminative feature for the query (a query image with a text description) by jointly training a generative model and a retrieval model. Second, our model can automatically manipulate the visual features of the reference image according to the text description through adversarial learning between the synthesized image and the target image. Third, global-local collaborative discriminators and attention-based generators are exploited, allowing our approach to focus on both the global and local differences between the query image and the target image. As a result, the semantic consistency and fine-grained details of the generated images are better enhanced in our model. The generated image can also be used to interpret and empower our retrieval model. Quantitative and qualitative evaluations on three benchmark datasets demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
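
A hedged sketch of the common pattern for composing an image query with modification text into a single retrieval feature (a gated residual fusion in the spirit of TIRG); the fusion rule is an illustrative assumption, not this paper's exact network.

```python
# Illustrative composition of a query image feature with a modification-text
# feature for retrieval: gated residual fusion (TIRG-style), with retrieval
# by cosine similarity. The fusion form is an assumption, not this paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComposeQuery(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.res = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, img_feat, txt_feat):
        x = torch.cat([img_feat, txt_feat], dim=-1)
        return self.gate(x) * img_feat + self.res(x)  # keep image, add text edit

composer = ComposeQuery()
q = composer(torch.randn(4, 512), torch.randn(4, 512))             # composed queries
gallery = torch.randn(100, 512)                                    # candidate image features
scores = F.normalize(q, dim=-1) @ F.normalize(gallery, dim=-1).T   # (4, 100)
ranked = scores.argsort(dim=-1, descending=True)                   # retrieval ranking
```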

Abstract:
This article offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to deep learning methods for data-to-text and text-to-text generation, as well as new applications of NLG technology. This survey aims to (a) give the latest synthesis of deep learning research on the core NLG tasks, as well as the architectures adopted in the field; (b) meticulously and comprehensively detail the various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; and (c) highlight future directions and relatively recent research issues that arise from the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text, and computational creativity.

Abstract:
Text-to-image generation aims to generate images from text descriptions. Its main challenge lies in two aspects: (1) semantic consistency, i.e., the generated images should be semantically consistent with the input text; and (2) visual reality, i.e., the generated images should look like real images. To ensure text-image consistency, existing works mainly learn to establish cross-modal representations via a text encoder and an image encoder. However, due to the limited representation capability of fixed-length embeddings and the flexibility of free-form text descriptions, the learned text-to-image model is incapable of maintaining semantic consistency between local image regions and fine-grained descriptions. As a result, the generated images sometimes miss fine-grained attributes of the generated object, such as the color or shape of a part of the object. To address this issue, this paper proposes a Local Feature Refinement Based Generative Adversarial Network (LFR-GAN), which first divides the text into independent fine-grained attributes and generates an initial image, then refines the image details based on these attributes. The main contributions are three-fold: (1) An attribute modeling approach is proposed to model fine-grained text descriptions by mapping them into representations of independent attributes, which provides more fine-grained details for image generation. (2) A local feature refinement approach is proposed to enable the generated image to fully reflect the fine-grained attributes contained in the text description. (3) A multi-stage generation approach is proposed to realize fine-grained manipulation of complex images progressively, which aims to improve the performance of the refinement and generate photo-realistic images. Extensive experiments on the CUB and Oxford102 datasets show the effectiveness of our LFR-GAN approach in both text-to-image generation and text-guided image manipulation tasks. Our LFR-GAN approach shows superior performance compared to state-of-the-art methods. The code will be released at https://github.com/PKU-ICST-MIPL/LFR-GAN_TOMM2023.
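
A hedged skeleton of the pipeline the abstract describes: split the caption into attribute phrases, generate an initial image, then refine it once per attribute. The phrase-splitting heuristic and the generator/refiner stubs are placeholder assumptions, not LFR-GAN's released code.

```python
# Illustrative pipeline skeleton: decompose a caption into attribute phrases,
# generate an initial image, then refine it attribute by attribute.
import re

def split_attributes(caption):
    """Crude heuristic: split on commas and conjunctions into attribute phrases."""
    parts = re.split(r',| and ', caption.lower())
    return [p.strip() for p in parts if p.strip()]

def generate_initial(caption):
    return {"image": "init", "from": caption}  # stand-in for a GAN forward pass

def refine_local(image, attribute):
    image.setdefault("refined_with", []).append(attribute)  # stand-in refinement stage
    return image

caption = "this bird has a red crown, a short yellow beak and white wings"
image = generate_initial(caption)
for attr in split_attributes(caption):  # multi-stage, per-attribute refinement
    image = refine_local(image, attr)
print(image["refined_with"])
# ['this bird has a red crown', 'a short yellow beak', 'white wings']
```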